1. INTRODUCTION

In December 2019, a novel coronavirus named 2019-nCoV (the "n" stands for novel) began to spread. The
virus causes severe acute respiratory infection (SARI) with symptoms including dyspnea, fever, asthenia,
and pneumonia. It first spread in China, specifically in Wuhan, and the first group of infected people was
broadly associated with a seafood market in Wuhan that also traded wild animals. Human-to-human contact
transmission of 2019-nCoV was then confirmed, and the number of infections increased rapidly, not only in
Wuhan but also in other major Chinese cities. The Chinese government took many measures to contain the
outbreak, but the coronavirus spread rapidly beyond China to the rest of the world. As elsewhere, many
COVID-19 positive cases appeared in Iraq, where the number of infected individuals increased sharply and
continues to evolve rapidly. By 23 June, the number of positive cases had reached 34,502, accompanied by a
significant increase in the number of deaths. Roughly 40% of these cases were identified in Baghdad [1]-[5].
Figure 1 shows the number of recovered and death COVID-19 cases [6]. Figure 2 shows the number of death
cases in Iraq [7].

Diagnosing COVID-19 is currently a critical task for medical practitioners, especially given the
growing number of patients and the variety of symptoms. Testing for COVID-19 in Iraq is difficult because
diagnostic systems are not available in every city, which delays disease detection. Because the supply of
COVID-19 testing kits is limited, other diagnostic measures are needed. COVID-19 attacks the epithelial
cells that line the respiratory tract, so computed tomography (CT) scan images can be used to analyze the
health of a patient's lungs. CT images are frequently used to diagnose lung inflammation, pneumonia,
abscesses, and enlarged lymph nodes, and nearly all hospitals have CT machines, so these machines are used
to test suspected COVID-19 cases instead of dedicated test kits. The main problem is that analyzing CT
images requires a radiology expert and takes a long time, which sick patients cannot afford. It is therefore
necessary to develop an automatic system that analyzes CT images to save time and speed up diagnosis
[8]-[11].


Figure 1. Statistics of COVID-19 in some countries
Figure 2. Number of COVID-19 death cases in Iraq


During the current coronavirus pandemic, many classification systems based on machine learning (ML)
and deep learning (DL) have proven successful at interpreting medical images because of their strong
classification and feature extraction capabilities [8]. For example, Li et al. [12] used lung CT images from
two scanners, uCT and uMI (United Imaging), to examine the relationship between CT imaging manifestations
and the clinical classification of COVID-19. They conducted a retrospective single-center study of 78
patients (38 males and 40 females) with COVID-19 from 18 January 2020 to 7 February 2020 in Zhuhai, China,
and divided the patients into three types based on Chinese guidelines. The first type, mild, included
patients with negative CT findings and minimal symptoms. The second type, common, and the third,
severe-critical, included patients with varying degrees of clinical manifestations and positive CT findings.
CT images were scored by visual quantitative estimation, summing the acute inflammatory lesions across the
lung lobes. The resulting total severity score (TSS) was compared with the clinical classification. A TSS
cutoff of 7.5 yielded 82.6% sensitivity and 100% specificity. They concluded that, because the proportion of
mild-type COVID-19 patients was comparatively high, CT images were not appropriate as an independent
screening tool. However, visual quantitative analysis of CT images is highly consistent and can therefore
reflect the clinical classification of COVID-19.

Narin et al. [13] used three convolutional neural network (CNN)-based models (ResNet50,
InceptionV3, and Inception-ResNetV2) to detect coronavirus pneumonia from chest X-ray radiographs. They
used a pre-trained ResNet50 model. The classification accuracies were 98%, 97%, and 87% for ResNet50,
InceptionV3, and Inception-ResNetV2, respectively. The experiments used a dataset of 100 chest X-ray images
(50 normal cases and 50 COVID-19 patients). The authors concluded that automatic COVID-19 detection can
help doctors detect the coronavirus at an early stage, so that appropriate decisions can be made based on
the high performance of the classification model. Barstugan et al. [14] used different CT tools to extract
the coronavirus image set. The dataset was collected manually and included 150 CT abdominal images
belonging to 53 infected cases from the Societa Italiana di Radiologia Medica e Interventistica. Five feature
extraction approaches (grey level co-occurrence matrix, grey level size zone matrix, grey level run length
matrix, local directional pattern, and discrete wavelet transform) were used to extract image features and
exclude irrelevant ones. The classification accuracy achieved was 99.68%.

Behera and Kumari [15] used deep learning to detect coronavirus-infected people from X-ray images.
Deep features were extracted from nine pre-trained CNN models and passed individually to a support vector
machine (SVM) classifier. Two datasets were used in this study. The first included 25 positive and 25
negative cases collected from the GitHub repository shared by Cohen [16] and the Kaggle repository [17].
The second included 266 cases (133 positive and 133 negative) collected from the open-i repository [18].
The classifier achieved an F-score of 95.52% and an accuracy of 95.38% for detecting COVID-19 (ignoring
acute respiratory distress syndrome (ARDS), Middle East respiratory syndrome (MERS), and severe acute
respiratory syndrome (SARS)). Song et al. [19] collected 275 CT scan images from two hospitals in China:
88 positive COVID-19 cases, 101 images of patients with bacterial pneumonia, and 86 images of healthy
persons. They used four deep learning models (VGG-16, DRENet, ResNet, and DRE-Net) for pneumonia
classification. The best F-score on the test set was 87%. Abdulmunem et al. [20] also used COVID-19 X-ray
images from the Kaggle dataset to train the ResNet50 deep learning network. The best results were obtained
with 5-fold cross-validation, with an accuracy of 97.28%. In [8], a modified deep learning network was used
to detect COVID-19 positive cases by extracting features from CT images and chest X-rays. Two phases were
adopted: in the first, several transfer-learning models were applied, while the second used a VGG-19 model
to find the best results for disease diagnosis. The VGG-19 model was evaluated on 1,000 images and achieved
99% accuracy, 97.4% sensitivity, and 99.4% specificity.

The contribution of this paper is an automated diagnostic system based on VGG-16 that classifies
radiology images as COVID-19 positive or negative and thereby provides a rapid and accurate diagnosis. The
remainder of this paper is structured as follows: section 2 presents the materials and methods, section 3
details the experimental results and discussion, and section 4 presents the conclusions.


2. MATERIALS AND METHODS 
2.1. VGG-16 model 

Simonyan et al. [21] proposed the VGG-16 architecture as a CNN model and entered it in the 2014
ImageNet large scale visual recognition challenge (ILSVRC), where it took first place in the localization
task and second place in the classification task. VGG-16 is a sixteen-layer network and is considered one of
the best vision model architectures to date. It yielded 92.7% top-5 test accuracy on ImageNet, a dataset of
over fourteen million images, in the 1,000-class challenge setting. Training VGG-16 took several weeks. As
shown in Figure 3, the layers of VGG-16 can be summarized as follows. First and second layers: the 224x224x3
input image is passed through two convolutional layers with 64 feature maps each, 3x3 filters, a stride of
1, and same padding, which changes the dimensions to 224x224x64. A max-pooling (sub-sampling) layer with a
2x2 filter and a stride of 2 then reduces the dimensions to 112x112x64. Third and fourth layers: two
convolutional layers with 128 feature maps, 3x3 filters, and a stride of 1 are applied, followed by a
max-pooling layer with a 2x2 filter and a stride of 2, which reduces the dimensions to 56x56x128. Fifth to
seventh layers: three convolutional layers with 256 feature maps, 3x3 filters, and a stride of 1 are
followed by a max-pooling layer with a 2x2 filter and a stride of 2, giving 28x28x256. Eighth to thirteenth
layers: two groups of three convolutional layers with 512 filters of size 3x3 and a stride of 1 are each
followed by a max-pooling layer, which reduces the dimensions first to 14x14x512 and finally to 7x7x512.
The convolutional output is then flattened into a vector of 25,088 features (7x7x512). Fourteenth and
fifteenth layers: two fully connected (FC) layers with 4,096 units each. Sixteenth (output) layer: a fully
connected SoftMax layer with 1,000 classes [22], [23].
Figure 3. VGG-16 neural network architecture
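
The way the feature-map dimensions shrink through the network can be checked with a short sketch. It assumes the standard VGG-16 configuration (13 convolutional layers with 3x3 filters, stride 1, and same padding, five 2x2 max-pooling layers with stride 2, and three fully connected layers). The names VGG16_CONFIG and trace_shapes are illustrative and not part of any library.

```python
# Standard VGG-16 configuration (an assumption of this sketch): integers are
# 3x3 convolutions (stride 1, same padding) with that many filters, and "M"
# is a 2x2 max-pooling layer with stride 2.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def trace_shapes(height=224, width=224, channels=3, config=VGG16_CONFIG):
    """Return the (height, width, channels) shape after every layer."""
    shapes = []
    for layer in config:
        if layer == "M":
            # Pooling halves the spatial size and keeps the channel count.
            height, width = height // 2, width // 2
        else:
            # A same-padded stride-1 convolution keeps the spatial size and
            # sets the channel count to the number of filters.
            channels = layer
        shapes.append((height, width, channels))
    return shapes


shapes = trace_shapes()
final_h, final_w, final_c = shapes[-1]
flattened = final_h * final_w * final_c
conv_layers = sum(1 for layer in VGG16_CONFIG if layer != "M")

print(shapes[2])        # shape after the first pooling layer
print(shapes[-1])       # shape after the last pooling layer
print(flattened)        # number of inputs to the first fully connected layer
print(conv_layers + 3)  # convolutional layers plus three fully connected layers
```

The flattened size printed by the sketch is the product of the final feature-map dimensions, which is where the 25,088 inputs to the fully connected layers come from.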


2.2. Proposed method 

Image classification is the process of automatically assigning a class label to new images based
on patterns learned from labeled data. The classification system divides the data into training and testing
sets. During the training phase, optimal parameters are obtained; they are then used in the testing phase to
predict the labels of new images. The adopted classification system consists of four main phases, as shown
in Figure 4: preprocessing, deep CNN feature extraction, classification, and evaluation. Preprocessing
prepares the images for the next phase. Feature extraction then extracts the relevant features and discards
the unimportant ones using deep learning filters. The extracted features are passed to the classifier to
find the best results. Finally, evaluation measures the classifier performance. These phases are explained
in the following subsections.
Figure 4. The adopted classification system


2.2.1. Basrah dataset

The data used in the current study were collected from Al-Sadr educational hospital in Basrah, Iraq.
These video data, called BasrahDataset, were collected manually from Iraqi patients in Basrah City by a
specialist in infectious diseases between April and June 2020, and include chest X-ray (CXR) and computed
tomography (CT) scan images. BasrahDataset includes 50 cases (30 males and 20 females), covering both
positive and negative COVID-19 cases. The patients' ages ranged from 18 to 65. The dataset contains roughly
1,423 images (1,181 positive and 242 negative). The labels were confirmed by the clinical picture in
addition to a polymerase chain reaction (PCR) test. The data are used in the current study under an official
approval document issued by Al-Sadr educational hospital. The automatic detection system is built on this
expert-labeled data by dividing it into two groups: training and testing. The training data comprise
approximately 818 images, of which 694 are confirmed COVID-19 cases and 124 are negative cases. The testing
data comprise 605 images, of which 487 are positive and 118 are negative. The training group is used to
learn the extracted deep features and build the classifier model using VGG-16, and the testing group is used
to identify COVID-19 infected patients automatically with the built model. Figure 5 shows an example of
BasrahDataset CT images. The VGG model was designed with the Keras package in Python (version 3.7.4, 64-bit).
Figure 5. An example of CT images
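
The reported image counts can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in this subsection; the dictionary layout and variable names are illustrative.

```python
# Image counts as reported for BasrahDataset in this subsection.
split = {
    "train": {"positive": 694, "negative": 124},
    "test": {"positive": 487, "negative": 118},
}

# Images per split and per class.
totals = {name: sum(counts.values()) for name, counts in split.items()}
positives = sum(counts["positive"] for counts in split.values())
negatives = sum(counts["negative"] for counts in split.values())
all_images = positives + negatives

print(totals)
print(positives, negatives, all_images)
print(round(100 * positives / all_images, 1))  # share of positive images, in percent
```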


2.2.2. Preprocessing phase 

Preprocessing is an important stage that prepares the image for the next step. It is a set of
operations applied to the image to remove noise and extract the region of interest (ROI). In the first step,
the CT image is converted to grayscale and resized. All images in the dataset are passed as input to the
VGG-16 neural network model, which requires inputs of the same size. Therefore, all images are resized to a
fixed size with as little shrinking as possible to avoid degrading classification accuracy through
deformation. Resizing also reduces the memory and computation required for image processing [24], [25].
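
The grayscale conversion and resizing steps can be sketched with NumPy as follows. This is a minimal illustration and not the preprocessing code of this study: the luma weights, the nearest-neighbour resizing, the 224x224 target size, and the repetition of the gray plane into three channels (so that the result matches the VGG-16 input shape) are all assumptions, and the input is a random array standing in for a CT slice.

```python
import numpy as np


def to_grayscale(image):
    """Convert an H x W x 3 RGB array to H x W using ITU-R BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return image[..., :3].astype(np.float32) @ weights


def resize_nearest(image, size=(224, 224)):
    """Resize a 2-D array to `size` with nearest-neighbour sampling."""
    out_h, out_w = size
    in_h, in_w = image.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]


def preprocess(image, size=(224, 224)):
    """Grayscale, resize, and repeat the gray plane into three channels."""
    gray = resize_nearest(to_grayscale(image), size)
    # VGG-16 expects three input channels, so the gray plane is repeated.
    return np.repeat(gray[..., None], 3, axis=-1).astype(np.float32)


# Synthetic stand-in for a CT slice (random values, not real data).
rng = np.random.default_rng(0)
fake_slice = rng.integers(0, 256, size=(512, 400, 3), dtype=np.uint8)
batch_item = preprocess(fake_slice)
print(batch_item.shape, batch_item.dtype)
```

In practice an interpolating resize from an image library would usually be preferred over nearest-neighbour sampling; the simpler method is used here only to keep the sketch self-contained.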


2.2.3. Deep CNN features extraction 

A deep CNN is composed of different layers that act as a feature extractor. Earlier CNN layers
capture lower-level features than later (convolutional or pooling) layers. The CNN hidden layers consist of
one or more convolutional layers, each followed by a pooling layer, arranged sequentially and followed by
one or more fully connected (FC) layers. The convolutional layers extract the relevant features, while the
last FC layer acts as a classifier. Each convolutional layer is composed of two parts: a filter bank layer
and a nonlinearity layer. The input is represented as a matrix and passed to the convolutional layers. Its
dimensions are WxHx3, where W and H are the width and height, respectively, and 3 is the number of feature
maps (the three color channels of an RGB image). The filter bank layer includes multiple trainable kernels
associated with each feature map. Each kernel is capable of identifying a specific feature at every location
of the input matrix. The nonlinearity layer, in turn, applies a nonlinear activation function to the output
of the filter bank layer. After that, the pooling layers sub-sample each feature map to decrease its
resolution. The output of the convolutional layers is then passed to the FC layers, which make the final
decision on the class that the image belongs to, based on different weighted combinations of their inputs
[25].
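
The filter bank, nonlinearity, and pooling stages described above can be illustrated on a single small feature map. The sketch below is a toy example under stated assumptions: one hand-written 3x3 kernel instead of trained kernels, ReLU as the nonlinearity, and 2x2 max pooling. The function names are illustrative.

```python
import numpy as np


def conv2d_same(feature_map, kernel):
    """Stride-1, zero-padded 2-D cross-correlation of one map with one kernel."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(feature_map, pad)
    out = np.zeros_like(feature_map, dtype=np.float64)
    for i in range(feature_map.shape[0]):
        for j in range(feature_map.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out


def relu(x):
    """Nonlinearity layer: negative responses are set to zero."""
    return np.maximum(x, 0.0)


def max_pool_2x2(x):
    """Pooling layer: keep the maximum of each non-overlapping 2x2 window."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# A 6x6 input containing a vertical edge, and a kernel that responds to it.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

response = relu(conv2d_same(image, kernel))
pooled = max_pool_2x2(response)
print(response.shape, pooled.shape)
print(pooled)
```

The kernel responds only in the columns next to the edge, and pooling halves the resolution of the map while keeping that response.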


2.2.4. Classification phase

In this phase, the images obtained by CT scan are classified in order to recognize patients
infected with COVID-19. Classification combines two main steps. First, the deep features extracted by the
CNN model from the CT scan images are used as input data to train a support vector machine classifier with a
linear kernel function. Second, the trained classifier is applied to the test images by feeding the features
obtained from the previous layers to the SVM classifier. This identifies each case as either COVID-19
positive or negative.
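
The two steps above can be sketched with scikit-learn, whose SVC class supports a linear kernel. This is an illustration and not the code of this study: scikit-learn is an assumed choice of library, and the feature vectors are synthetic Gaussian clusters standing in for deep features, so the printed accuracy says nothing about real CT images.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for deep feature vectors (NOT real CT features):
# two Gaussian clusters, one per class, in a 64-dimensional feature space.
n_per_class, n_features = 100, 64
negative = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, n_features))
positive = rng.normal(loc=+1.0, scale=1.0, size=(n_per_class, n_features))
features = np.vstack([negative, positive])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# Shuffle, then hold out a quarter of the samples for testing.
order = rng.permutation(len(labels))
features, labels = features[order], labels[order]
n_train = int(0.75 * len(labels))
x_train, y_train = features[:n_train], labels[:n_train]
x_test, y_test = features[n_train:], labels[n_train:]

# Step 1: fit a linear-kernel SVM on the training features.
classifier = SVC(kernel="linear")
classifier.fit(x_train, y_train)

# Step 2: predict positive (1) or negative (0) for unseen feature vectors.
predictions = classifier.predict(x_test)
test_accuracy = float(np.mean(predictions == y_test))
print(test_accuracy)
```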


2.2.5. Evaluation phase

The performance of the classification can be evaluated using precision, recall, and F1-score. The
classification system performance was measured with the F1-score and the accuracy, calculated by the
following formulas [26]-[28]:

Accuracy = (TP + TN) / (TP + FP + TN + FN) (1)

F-score = (2 x Recall x Precision) / (Recall + Precision) (2)

where true positive (TP) is the number of cases correctly recognized as belonging to the class, false
positive (FP) is the number of cases incorrectly assigned to the class, true negative (TN) is the number of
cases correctly recognized as not belonging to the class, and false negative (FN) is the number of cases
that belong to the class but were not recognized as such. Precision is the number of correctly classified
positive cases divided by the number of cases labeled by the system as positive.


Precision = TP / (TP + FP) (3)


Recall (sensitivity) is the number of correctly classified positive cases divided by the number of positive
cases in the data.


Recall = TP / (TP + FN) (4)


Specificity measures how well the system identifies patients who do not have COVID-19. It is the number of
correctly classified negative cases divided by the number of negative cases in the data.


Specificity = TN / (TN + FP) (5)
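
The evaluation measures can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions, with specificity taken as TN / (TN + FP). The function name and the counts are made up for illustration and are not results of this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, specificity, and F1-score."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)     # true negative rate
    f1 = 2 * recall * precision / (recall + precision)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```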


3. RESULTS AND DISCUSSION 

The convolutional neural network (CNN) architecture is built from many layers, such as
convolutional, rectified linear unit (ReLU), and pooling layers. The ReLU activation function is used in the
hidden layers and SoftMax is used in the output layer. Average and fully connected layers are also used. A
dropout layer is added to the network to improve the classifier performance during training. This layer is
active only in the training phase, where it randomly drops a certain number of neurons during the forward
pass; only the non-dropped neurons are updated during the backward pass. The main purpose of dropout is
regularization: the model learns more robust features and avoids overfitting during training. The input CT
images are passed into the CNN detection pipeline, which starts with deep feature extraction and ends with
decision making. Detection performance in the current study is evaluated for the VGG-16 CNN pipeline using
the loss and accuracy curves. BasrahDataset is used for evaluation. Its roughly 1,423 images are divided
into two groups: training and testing. Roughly 818 CT images were used for training and validation (654 for
training and 164 for validation), and 605 CT images were used to evaluate the trained model. Figure 6 shows
the experimental results with the best accuracy and loss. Each point on the accuracy curve represents the
rate of correct predictions on the training or validation images. The accuracy on both the training set and
the validation set reaches around 100% after 10 epochs. The test accuracy on the BasrahDataset testing
images is 99%.
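
The dropout behaviour mentioned in this section can be illustrated with a few lines of plain Python. This is a minimal sketch of inverted dropout, the variant implemented by the Keras Dropout layer, in which kept units are rescaled during training. The function name, the rate of 0.5, and the seed are illustrative assumptions and not the settings used in this study.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout: zero each unit with probability `rate` during training.

    Kept units are scaled by 1 / (1 - rate) so the expected activation is
    unchanged. At inference time the input is returned untouched.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)
activations = [1.0] * 10
train_out = dropout(activations, rate=0.5, training=True, rng=rng)
infer_out = dropout(activations, rate=0.5, training=False, rng=rng)
print(train_out)  # a mix of zeros and rescaled values
print(infer_out)  # identical to the input
```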

VGG-16 achieved a sensitivity of 99%. Specificity, similarly, indicates the true negative rate. In
addition, the adopted classification system achieved better results than the related works in terms of data
type, data size, and classifier used. Table 1 lists other researchers' work using different deep learning
techniques and datasets for the prediction of COVID-19. The advantage of our work is that it uses a large
dataset while achieving high accuracy.
Figure 6. Loss and accuracy on BasrahDataset
Table 1. Comparisons between our proposed system and some related works


Reference               Classifier            Type of data               Amount of data  Classifier performance
Kamil [8]               VGG-19                CT images and chest X-ray  1,000 images    99% accuracy
Li et al. [12]          Statistical analysis  CT chest images            1,540 images    82.6% sensitivity, 100% specificity
Narin et al. [13]       CNN                   Chest X-ray images         100 images      98% accuracy
Barstugan et al. [14]   SVM                   CT chest images            150 images      99% accuracy
Sethy et al. [15]       CNN                   Chest X-ray images         316 images      95.52% F-score, 95.38% accuracy
Song et al. [19]        VGG-16                CT chest images            275 images      84% F-score and accuracy
Abdulmunem et al. [20]  ResNet50              Chest X-ray images         50 images       97.28% accuracy
Current study           VGG-16                CT chest images            1,423 images    99% F-score and accuracy


The main contributions of the current study can be summarized as follows: the proposed system does
not suffer from data imbalance, and the VGG-16 model was trained with a larger number of COVID-19 CT scan
images than the previous studies. In addition, the proposed system is a fully automated diagnosis system
that does not need any prior operation to extract features. Despite these advantages, the proposed system
has some limitations: it has not been trained on other types of respiratory diseases, so it can only
distinguish COVID-19 infected people from healthy individuals and cannot diagnose other types of pneumonia
and respiratory diseases. Addressing this is left as future work to improve the system.


4. CONCLUSION 

In this paper, deep learning classification is adopted to identify COVID-19 in CT images. The
images were collected from Iraqi patients in Basrah city and consist of 1,423 CT images of positive and
negative COVID-19 cases. The classification model extracts the features using the pre-trained model. Then,
the SVM classifier is used to recognize the coronavirus cases. The classification system achieved relatively
high performance on BasrahDataset CT images. The results were also compared with some related works. In
future work, the system can be developed to diagnose other respiratory diseases and to detect the level of
coronavirus infection.